TakeLab-QA at SemEval-2017 Task 3: Classification Experiments for Answer Retrieval in Community QA
Authors
Abstract
In this paper we present the TakeLab-QA entry to SemEval-2017 Task 3, a question-comment re-ranking problem. We take a classification-based approach using two supervised learning models: Support Vector Machines (SVM) and Convolutional Neural Networks (CNN). We use features based on different semantic similarity models (e.g., Latent Dirichlet Allocation), features based on several types of pre-trained word embeddings, and some handcrafted task-specific features. For training, our system uses no external labeled data apart from that provided by the organizers. Our primary submission achieves a MAP score of 81.14 and an F1 score of 66.99, ranking 10th on SemEval-2017 Task 3, Subtask A.
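As a rough illustration of the MAP figure reported above, the following sketch computes mean average precision over re-ranked comment threads. It is not the official SemEval scorer (which may apply a rank cutoff and handle threads without relevant comments differently); the function names, the `(score, is_relevant)` input layout, and the thread identifiers are assumptions made for this example only.

```python
from typing import Dict, List, Tuple


def average_precision(ranked_labels: List[bool]) -> float:
    """Average precision for one ranked list of relevance labels."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Assumption: a thread with no relevant comment contributes 0.0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    threads: Dict[str, List[Tuple[float, bool]]]
) -> float:
    """Mean of per-thread average precision.

    `threads` maps a hypothetical question id to a list of
    (classifier_score, is_relevant) pairs, one pair per comment.
    """
    ap_values = []
    for comments in threads.values():
        # Re-rank comments by descending classifier score.
        ranked = sorted(comments, key=lambda pair: pair[0], reverse=True)
        ap_values.append(average_precision([label for _, label in ranked]))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0


if __name__ == "__main__":
    example = {
        "q1": [(0.9, True), (0.1, False)],
        "q2": [(0.2, True), (0.8, False)],
    }
    print(mean_average_precision(example))
```

In this setup the classifier's confidence (for instance an SVM decision value or a CNN output probability) serves as the ranking score, so a classification model can be evaluated with a ranking metric.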
Similar resources
GW_QA at SemEval-2017 Task 3: Question Answer Re-ranking on Arabic Fora
This paper describes our submission to SemEval-2017 Task 3 Subtask D, "Question Answer Ranking in Arabic Community Question Answering". In this work, we applied a supervised machine learning approach to automatically re-rank a set of QA pairs according to their relevance to a given question. We employ features based on latent semantic models, namely WTMF, as well as a set of lexical features ba...
Latent Space Embedding for Retrieval in Question-Answer Archives
Community-driven Question Answering (CQA) systems such as Yahoo! Answers have become valuable sources of reusable information. CQA retrieval enables usage of historical CQA archives to solve new questions posed by users. This task has received much recent attention, with methods building upon literature from translation models, topic models, and deep learning. In this paper, we devise a CQA ret...
Question Answering as a Classification Task
In this paper we treat question answering (QA) as a classification problem. Our motivation is to build systems for many languages without the need for highly tuned linguistic modules. Consequently, word tokens and web data are used extensively but no explicit linguistic knowledge is incorporated. A mathematical model for answer retrieval, answer classification and answer length prediction is de...
Chinese QA and CLQA: NTCIR-5 QA Experiments at UNT
This paper describes our participation in the NTCIR-5 CLQA task. Three runs were officially submitted for three subtasks: Chinese Question Answering, English-Chinese Question Answering, and Chinese-English Question Answering. We expanded our TREC experimental QA system EagleQA this year to include Chinese QA and Cross-Language QA capabilities. Various information retrieval and natural language ...
SCIR-QA at SemEval-2017 Task 3: CNN Model Based on Similar and Dissimilar Information between Keywords for Question Similarity
We describe a method for calculating the similarity between questions in community QA (cQA). Questions in cQA are usually very long and contain a lot of information that is useless for calculating the similarity between questions. Therefore, we implement a CNN model based on similar and dissimilar information between question keywords. We extract the keywords of questions, and then model the similar and dissi...